I came across NumPy while looking for a more efficient way to read a text file, so I am keeping some notes here for convenient access.

1. Input and Output

1.1 Load data from a text file

np.loadtxt loads data from a simply formatted text file and returns an ndarray. Note that each row in the text file must have the same number of values (no data may be missing).

import numpy as np

np.loadtxt(fname,                   # file, str, or generator: the file, filename, or generator to read.
            dtype=float,            # data-type of the resulting array; default: float.
            comments='#',           # str or sequence, the characters or list of characters used to indicate the start of a comment.
            delimiter=None,         # str, the string separating values; by default, this is any whitespace.
            converters=None,        # dict, mapping column number to a function. E.g., if column 0 is a date string: converters = {0: datestr2num}.
            skiprows=0,             # int, skip the first skiprows lines.
            usecols=None,           # sequence, identifies which columns to read. E.g., `usecols = (1,4,5)` will extract the 2nd, 5th and 6th columns.
            unpack=False,           # If True, the returned array is transposed, so that arguments may be unpacked using x, y, z = loadtxt(...).
            ndmin=0)                # int, the returned array will have at least ndmin dimensions. Legal values: 0 (default), 1 or 2.
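
As a rough illustration of how a few of these parameters combine, the sketch below reads from an in-memory buffer instead of a real file. The sample data and the variable names (`text`, `t`, `y`) are made up for this example.

```python
import io
import numpy as np

# Hypothetical sample data standing in for a file on disk.
text = io.StringIO(
    "# time,x,y\n"
    "0.0,1.0,10.0\n"
    "0.5,2.0,20.0\n"
    "1.0,3.0,30.0\n"
)

# The first line starts with '#', so it is skipped as a comment.
# usecols keeps columns 0 and 2; unpack=True returns one array per column.
t, y = np.loadtxt(text, delimiter=",", usecols=(0, 2), unpack=True)

print(t.tolist())  # [0.0, 0.5, 1.0]
print(y.tolist())  # [10.0, 20.0, 30.0]
```

Passing a filename instead of the `io.StringIO` object should behave the same way.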

1.2 Save an array to a text file

np.savetxt is used to save an array to a text file.

np.savetxt(fname,           # filename or file handle; if the filename ends in '.gz', the file is automatically saved in compressed gzip format.
            X,              # array_like, data to be saved to a text file.
            fmt='%.18e',    # str or sequence of strs: a single format, or a list of specifiers, one per column. For complex X, e.g. ['%.3e + %.3ej', '(%.15e%+.15ej)'] for 2 columns.
            delimiter=' ',  # str, string or character separating columns.
            newline='\n',   # str, string or character separating lines.
            header='',      # str, string that will be written at the beginning of the file.
            footer='',      # str, string that will be written at the end of the file.
            comments='# ')  # str, string that will be prepended to the header and footer strings, to mark them as comments.
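
To see what `fmt`, `delimiter`, `header` and `comments` actually produce, here is a small sketch that writes to an in-memory buffer and prints the result. The array values and the column names in the header are invented for illustration.

```python
import io
import numpy as np

# Hypothetical sample values: an id column and a measurement column.
data = np.array([[1, 2.5],
                 [2, 3.75]])

buf = io.StringIO()
# One format specifier per column; the header gets the default '# ' prefix.
np.savetxt(buf, data, fmt=["%d", "%.2f"], delimiter=",", header="id,value")

saved = buf.getvalue()
print(saved)
# # id,value
# 1,2.50
# 2,3.75
```

Because the header line is prefixed with `# `, np.loadtxt would treat it as a comment when reading the file back.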

1.3 An example

Here is an example that shows how to read and write a CSV file with NumPy.

header = ['hopcount', 'r1', 'r2', 'r3', 'r4', 'r5']
fname = 'hopcount_created.csv'

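# write to a file (table is an existing 2-D integer array holding the data shown below)
np.savetxt(fname, table, fmt='%d', delimiter=',', header=','.join(header)) # header must be a single str, so the column names are joined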

# read from a file
table = np.loadtxt(fname, dtype=int, delimiter=',') # the header line starts with '#', so it is skipped as a comment

The contents of hopcount_created.csv:

# hopcount,r1,r2,r3,r4,r5
1,4900,4834,4836,4860,4860
2,13894,13244,13254,13607,13724
3,30155,25789,25804,28619,29423
4,55506,42344,42289,51220,53589
5,77348,54515,55165,70919,74873
6,80973,58049,59442,75744,79389
7,68230,55699,57578,67883,68917
8,50773,51123,52449,54718,52662
9,37409,46793,47001,41331,38791
10,23385,38989,38149,27519,24708
11,13484,31091,28411,16723,14412
12,6541,21284,19205,8318,7020
13,2808,11936,10906,3664,3006
14,963,6051,6007,1223,995
15,268,2914,3444,289,268
16,46,1279,1750,46,46
17,0,516,692,0,0
18,0,189,257,0,0
19,0,44,44,0,0

2. genfromtxt

np.genfromtxt loads data from a text file like np.loadtxt, but it can also handle missing values.

numpy.genfromtxt(fname, dtype=float, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None)
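
The main reason to reach for genfromtxt over loadtxt is missing data. The following is a minimal sketch, assuming a small comma-separated input with one empty field; the sample text and the fill value -1 are chosen only for illustration.

```python
import io
import numpy as np

# Hypothetical CSV text with a header row and one empty field (row 2, column b).
csv_text = "a,b,c\n1,2,3\n4,,6\n"

# With the default float dtype, the empty field is read as nan.
raw = np.genfromtxt(io.StringIO(csv_text), delimiter=",", skip_header=1)

# filling_values replaces missing entries with a chosen value instead.
filled = np.genfromtxt(io.StringIO(csv_text), delimiter=",",
                       skip_header=1, filling_values=-1)

print(np.isnan(raw[1, 1]))  # True
print(filled.tolist())      # [[1.0, 2.0, 3.0], [4.0, -1.0, 6.0]]
```

np.loadtxt would be expected to fail on the same input, since every row must contain the same number of valid values.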

3. Indexing

The basic slice syntax is start:stop:step
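
A quick sketch of the start:stop:step syntax on a one-dimensional array (the array `x` is just an example):

```python
import numpy as np

x = np.arange(10)        # [0, 1, 2, ..., 9]

every_other = x[2:8:2]   # start at 2, stop before 8, step 2
first_three = x[:3]      # omitted start defaults to 0
reversed_x = x[::-1]     # negative step walks backwards

print(every_other.tolist())  # [2, 4, 6]
print(first_three.tolist())  # [0, 1, 2]
print(reversed_x.tolist())   # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

The stop index is always exclusive, as with ordinary Python lists.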

For x[obj], there are three kinds of indexing available: field access, basic slicing, and advanced indexing. Which one occurs depends on obj.
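
The sketch below shows one instance of each kind. The structured dtype with fields `id` and `score` is hypothetical and only serves to demonstrate field access, which applies to structured arrays.

```python
import numpy as np

# Field access: obj is a field name of a structured array.
rec = np.array([(1, 2.0), (3, 4.0)], dtype=[("id", "i4"), ("score", "f8")])
ids = rec["id"]

# Basic slicing: obj is a slice (or integer); the result is a view.
x = np.arange(6)
view = x[1:4]
view[0] = 100            # writes through to x

# Advanced indexing: obj is a list or array of indices; the result is a copy.
picked = x[[0, 2]]
picked[0] = -1           # x is not affected

print(ids.tolist())      # [1, 3]
print(x.tolist())        # [0, 100, 2, 3, 4, 5]
print(picked.tolist())   # [-1, 2]
```

The view versus copy distinction is the practical difference to keep in mind between basic slicing and advanced indexing.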

Basic slicing and advanced indexing

import numpy as np

table = np.array([[ 1,  2,  3],
                  [ 4,  5,  6],
                  [ 7,  8,  9],
                  [10, 11, 12]])

# get the row at index 2 (the third row)
>>> table[2]
array([7, 8, 9])

# get the column at index 2 (the third column)
>>> table[:,2]
array([ 3,  6,  9, 12])

# get an element (equivalent to table[2][2], but without creating the intermediate row)
>>> table[2, 2]
9

# get a range of rows and columns
>>> table[2:4, 1:3]     # rows 2 and 3, columns 1 and 2 (the stop index is exclusive)
array([[ 8,  9],
       [11, 12]])

# get a subarray with specific rows and columns
>>> table[[[1],[3]], [0,2]]   # rows 1 and 3, columns 0 and 2; the (2,1) row index broadcasts against the (2,) column index
array([[ 4,  6],
       [10, 12]])

>>> table[[1,3], [[0],[2]]]   # here the column index has shape (2,1), so broadcasting gives the transpose of the previous result
array([[ 4, 10],
       [ 6, 12]])

# Advanced indexing
>>> table[[1,3], [0,2]]       # the row and column indices are paired element-wise: selects table[1, 0] and table[3, 2]
array([ 4, 12])

>>> table[[1,3], [0]]         # [0] broadcasts to [0, 0]: selects table[1, 0] and table[3, 0]
array([ 4, 10])
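
The nested-list form table[[[1],[3]], [0,2]] used above to pick specific rows and columns can be hard to read. One alternative is np.ix_, which builds the broadcastable index arrays for you. A minimal sketch using the same table:

```python
import numpy as np

table = np.array([[ 1,  2,  3],
                  [ 4,  5,  6],
                  [ 7,  8,  9],
                  [10, 11, 12]])

# np.ix_ turns the row list and column list into index arrays of
# shape (2, 1) and (1, 2), which broadcast to a 2x2 selection.
sub = table[np.ix_([1, 3], [0, 2])]

print(sub.tolist())  # [[4, 6], [10, 12]]
```

This selects the same sub-array as table[[[1],[3]], [0,2]], with the rows and columns stated as plain lists.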

This article is an original work by Spark & Shine; please credit the source when reposting. Last modified: 2022-03-17 22:40